Techniques for Efficient Learning without Search

نویسندگان

  • Houssam Salem
  • Pramuditha Suraweera
  • Geoffrey I. Webb
  • Janice R. Boughton
چکیده

Averaged n-Dependence Estimators (AnDE) is a family of learning algorithms that range from low variance coupled with high bias through to high variance coupled with low bias. The asymptotic error of the lowest bias variant is the Bayes optimal. The AnDE family of algorithms have a training time that is linear with respect to the training examples, learn in a single pass through the data, support incremental learning, handle missing values directly and are robust in the face of noise. These characteristics make the algorithms particularly well suited to learning from large data. However, for higher orders of n they are very computationally demanding. This paper presents data structures and algorithms developed to reduce both memory and time for training and classification. These enhancements have enabled the evaluation and comparison of A3DE’s effectiveness. The results provide further support for the hypothesis that as the number of training examples increases, decreasing error will be attained by members of the AnDE family with increasing levels of n.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Reachability checking in complex and concurrent software systems using intelligent search methods

Software system verification is an efficient technique for ensuring the correctness of a software product, especially in safety-critical systems in which a small bug may have disastrous consequences. The goal of software verification is to ensure that the product fulfills the requirements. Studies show that the cost of finding and fixing errors in design time is less than finding and fixing the...

متن کامل

A Hybrid Unconscious Search Algorithm for Mixed-model Assembly Line Balancing Problem with SDST, Parallel Workstation and Learning Effect

Due to the variety of products, simultaneous production of different models has an important role in production systems. Moreover, considering the realistic constraints in designing production lines attracted a lot of attentions in recent researches. Since the assembly line balancing problem is NP-hard, efficient methods are needed to solve this kind of problems. In this study, a new hybrid met...

متن کامل

Lot Streaming in No-wait Multi Product Flowshop Considering Sequence Dependent Setup Times and Position Based Learning Factors

This paper considers a no-wait multi product flowshop scheduling problem with sequence dependent setup times. Lot streaming divide the lots of products into portions called sublots in order to reduce the lead times and work-in-process, and increase the machine utilization rates. The objective is to minimize the makespan. To clarify the system, mathematical model of the problem is presented. Sin...

متن کامل

Structural Reliability: An Assessment Using a New and Efficient Two-Phase Method Based on Artificial Neural Network and a Harmony Search Algorithm

In this research, a two-phase algorithm based on the artificial neural network (ANN) and a harmony search (HS) algorithm has been developed with the aim of assessing the reliability of structures with implicit limit state functions. The proposed method involves the generation of datasets to be used specifically for training by Finite Element analysis, to establish an ANN model using a proven AN...

متن کامل

Learning with Genetic Algorithms: An Overview

Genetic algorithms represent a class of adaptive search techniques that have been intensively studied in recent years, Much of the interest in genetic al­ gorithms is due to the fact that they provide a set of efficient domain-independent search heuristics which are a significant improvement over traditional "weak meth­ ods" without the need for incorporating highly domain-specific knowledge. T...

متن کامل

An improved opposition-based Crow Search Algorithm for Data Clustering

Data clustering is an ideal way of working with a huge amount of data and looking for a structure in the dataset. In other words, clustering is the classification of the same data; the similarity among the data in a cluster is maximum and the similarity among the data in the different clusters is minimal. The innovation of this paper is a clustering method based on the Crow Search Algorithm (CS...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012